53 research outputs found

    RUPNet: Residual upsampling network for real-time polyp segmentation

    Colorectal cancer is among the most prevalent causes of cancer-related mortality worldwide. Detection and removal of polyps at an early stage can help reduce mortality and even help prevent spread to adjacent organs. Early polyp detection could save the lives of millions of patients around the world as well as reduce the clinical burden. However, the polyp detection rate varies significantly among endoscopists. Numerous deep learning-based methods have been proposed; however, most studies focus on improving accuracy. Here, we propose a novel architecture, Residual Upsampling Network (RUPNet), for colon polyp segmentation that can process images in real time and shows high recall and precision. The proposed architecture, RUPNet, is an encoder-decoder network that consists of three encoders, three decoder blocks, and some additional upsampling blocks at the end of the network. With an image size of 512×512, the proposed method achieves an excellent real-time operation speed of 152.60 frames per second with an average Dice coefficient of 0.7658, mean intersection over union of 0.6553, sensitivity of 0.8049, precision of 0.7995, and F2-score of 0.9361. The results suggest that RUPNet can give real-time feedback while retaining high accuracy, indicating a good benchmark for early polyp detection. Comment: Accepted SPIE Medical Imaging 202
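
The metrics quoted in this abstract (Dice coefficient, intersection over union, sensitivity, precision, F2-score) can all be derived from pixel-wise true positives, false positives, and false negatives. The following is a minimal sketch using the standard definitions, not the authors' evaluation code; the function name `segmentation_metrics` and the toy masks are hypothetical.

```python
import numpy as np

def segmentation_metrics(pred, target, eps=1e-7):
    """Overlap metrics for two binary masks of equal shape (standard definitions)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    tp = np.logical_and(pred, target).sum()    # predicted polyp, truly polyp
    fp = np.logical_and(pred, ~target).sum()   # predicted polyp, truly background
    fn = np.logical_and(~pred, target).sum()   # missed polyp pixels

    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)              # also called sensitivity
    dice = 2 * tp / (2 * tp + fp + fn + eps)
    iou = tp / (tp + fp + fn + eps)
    # F-beta with beta = 2 weights recall four times as heavily as precision.
    f2 = 5 * precision * recall / (4 * precision + recall + eps)
    return {"dice": dice, "iou": iou, "precision": precision,
            "recall": recall, "f2": f2}

# Toy 2x3 masks (illustrative data only): 2 TP, 1 FP, 1 FN.
pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
metrics = segmentation_metrics(pred, target)
```

Dataset-level figures are commonly obtained by averaging such per-image values, so a reported mean F2-score need not equal the F2-score computed from the mean precision and mean recall.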

    TransRUPNet for Improved Out-of-Distribution Generalization in Polyp Segmentation

    Out-of-distribution (OOD) generalization is a critical challenge in deep learning. It is especially important when the test samples are drawn from a different distribution than the training data. We develop a novel real-time deep learning-based architecture, TransRUPNet, which is based on a Transformer and a residual upsampling network, for colorectal polyp segmentation to improve OOD generalization. The proposed architecture, TransRUPNet, is an encoder-decoder network that consists of three encoder blocks, three decoder blocks, and some additional upsampling blocks at the end of the network. With an image size of 256×256, the proposed method achieves an excellent real-time operation speed of 47.07 frames per second with a mean Dice coefficient of 0.7786 and a mean Intersection over Union of 0.7210 on the out-of-distribution polyp datasets. The results on the publicly available PolypGen dataset (the OOD dataset in our case) suggest that TransRUPNet can give real-time feedback while retaining high accuracy on the in-distribution dataset. Furthermore, we demonstrate the generalizability of the proposed method by showing that it significantly improves performance on OOD datasets compared to existing methods.

    SynergyNet: Bridging the Gap between Discrete and Continuous Representations for Precise Medical Image Segmentation

    In recent years, continuous latent space (CLS) and discrete latent space (DLS) deep learning models have been proposed for medical image analysis for improved performance. However, these models encounter distinct challenges. CLS models capture intricate details but often lack interpretability in terms of structural representation and robustness due to their emphasis on low-level features. Conversely, DLS models offer interpretability, robustness, and the ability to capture coarse-grained information thanks to their structured latent space. However, DLS models have limited efficacy in capturing fine-grained details. To address the limitations of both DLS and CLS models, we propose SynergyNet, a novel bottleneck architecture designed to enhance existing encoder-decoder segmentation frameworks. SynergyNet seamlessly integrates discrete and continuous representations to harness complementary information and successfully preserves both fine and coarse-grained details in the learned representations. Our extensive experiments on multi-organ segmentation and cardiac datasets demonstrate that SynergyNet outperforms other state-of-the-art methods, including TransUNet, with Dice scores improving by 2.16% and Hausdorff scores improving by 11.13%, respectively. When evaluating skin lesion and brain tumor segmentation datasets, we observe a remarkable improvement of 1.71% in Intersection-over-Union scores for skin lesion segmentation and of 8.58% for brain tumor segmentation. Our innovative approach paves the way for enhancing the overall performance and capabilities of deep learning models in the critical domain of medical image analysis. Comment: Accepted at WACV 202
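
The abstract does not say how the discrete latent space is constructed or how the two representations are combined. As a rough illustration only, the sketch below assumes that the discrete branch uses nearest-neighbour vector quantization against a codebook (a common choice in DLS models) and that fusion is a plain concatenation. Both choices, along with the names `quantize`, `codebook`, and `fused`, are assumptions and not SynergyNet's actual bottleneck.

```python
import numpy as np

def quantize(z, codebook):
    """Map each continuous vector in z (n, d) to its nearest codebook row (K, d)."""
    # Squared Euclidean distance from every latent vector to every code.
    dists = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    idx = dists.argmin(axis=1)          # index of the closest code per vector
    return codebook[idx], idx

# Toy continuous latents and a toy 3-entry codebook (illustrative values only).
z = np.array([[0.1, 0.0], [0.9, 1.2], [2.1, 1.9]])
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])

z_q, idx = quantize(z, codebook)

# Hypothetical fusion step: keep the fine-grained continuous vector next to
# its coarse discrete counterpart so a decoder could see both.
fused = np.concatenate([z, z_q], axis=1)
```

In a trained model the codebook would be learned jointly with the encoder; here it is fixed by hand so that the quantization step itself is easy to follow.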

    COROID: A Crowdsourcing-based Companion Drones to Tackle Current and Future Pandemics

    Due to the current COVID-19 virus, which has already been declared a pandemic by the World Health Organization (WHO), we are witnessing the greatest pandemic of the decade. Millions of people are being infected, resulting in thousands of deaths every day across the globe. Even the world's best healthcare-providing countries could not handle the pandemic because of the strain of treating thousands of patients at a time. The count of infections and deaths is increasing at an alarming rate because of the spread of the virus. We believe that innovative technologies could help reduce pandemics to a certain extent until we find a definite solution from the medical field to handle and treat such pandemic situations. Technology innovation has the potential to introduce new technologies that could support people and society during these difficult times. Therefore, this paper proposes the idea of using drones as a companion to tackle current and future pandemics. Our COROID drone is based on the principle of crowdsourcing sensor data from the public's smart devices, which can be correlated with the readings of the infrared cameras mounted on COROID drones. To the best of our knowledge, this concept has yet to be investigated either as a concept or as a product. Therefore, we believe that the COROID drone is innovative and has huge potential to tackle COVID-19 and future pandemics.

    Pathological Brain Detection Using Weiner Filtering, 2D-Discrete Wavelet Transform, Probabilistic PCA, and Random Subspace Ensemble Classifier

    Accurate diagnosis of pathological brain images is important for patient care, particularly in the early phase of the disease. Although numerous studies have used machine-learning techniques for the computer-aided diagnosis (CAD) of pathological brains, previous methods encountered challenges in terms of diagnostic efficiency owing to deficiencies in the choice of proper filtering techniques, neuroimaging biomarkers, and limited learning models. Magnetic resonance imaging (MRI) is capable of providing enhanced information regarding the soft tissues, and therefore MR images are included in the proposed approach. In this study, we propose a new model that includes Wiener filtering for noise reduction, 2D-discrete wavelet transform (2D-DWT) for feature extraction, probabilistic principal component analysis (PPCA) for dimensionality reduction, and a random subspace ensemble (RSE) classifier with the K-nearest neighbors (KNN) algorithm as a base classifier to classify brain images as pathological or normal. The proposed methods provide a significant improvement in classification results when compared to other studies. Based on 5×5 cross-validation (CV), the proposed method outperforms 21 state-of-the-art algorithms in terms of classification accuracy, sensitivity, and specificity for all four datasets used in the study.
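
The final stage of the pipeline described above, a random subspace ensemble with KNN base classifiers, can be sketched as follows. This is a generic illustration of the technique under assumed settings, not the authors' implementation: the function names, the ensemble size, the subspace size, `k`, and the synthetic data are all hypothetical, and the Wiener filtering, 2D-DWT, and PPCA stages are omitted.

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    """Binary (0/1) KNN prediction by majority vote among the k nearest rows."""
    dists = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = y_train[nearest]                      # shape (n_test, k)
    return (votes.mean(axis=1) > 0.5).astype(int)

def random_subspace_knn(X_train, y_train, X_test,
                        n_learners=15, subspace_dim=2, k=3, seed=0):
    """Train each KNN on a random subset of features, then majority-vote."""
    rng = np.random.default_rng(seed)
    n_features = X_train.shape[1]
    all_votes = []
    for _ in range(n_learners):
        # Each base learner only sees `subspace_dim` randomly chosen features.
        feats = rng.choice(n_features, size=subspace_dim, replace=False)
        all_votes.append(
            knn_predict(X_train[:, feats], y_train, X_test[:, feats], k))
    # Odd n_learners and odd k avoid ties for binary labels.
    return (np.mean(all_votes, axis=0) > 0.5).astype(int)

# Synthetic stand-in for reduced feature vectors: two well-separated clusters.
rng = np.random.default_rng(42)
X_normal = rng.normal(loc=0.0, scale=0.5, size=(20, 6))
X_pathological = rng.normal(loc=5.0, scale=0.5, size=(20, 6))
X_train = np.vstack([X_normal, X_pathological])
y_train = np.array([0] * 20 + [1] * 20)

X_test = np.array([[0.0] * 6, [5.0] * 6])
predictions = random_subspace_knn(X_train, y_train, X_test)
```

Training every base learner on a different feature subset decorrelates their errors, which is the usual motivation for preferring a random subspace ensemble over a single KNN on the full feature vector.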

    DoubleU-Net: A Deep Convolutional Neural Network for Medical Image Segmentation

    Semantic image segmentation is the process of labeling each pixel of an image with its corresponding class. An encoder-decoder based approach, like U-Net and its variants, is a popular strategy for solving medical image segmentation tasks. To improve the performance of U-Net on various segmentation tasks, we propose a novel architecture called DoubleU-Net, which is a combination of two U-Net architectures stacked on top of each other. The first U-Net uses a pre-trained VGG-19 as the encoder, which has already learned features from ImageNet and can be transferred to another task easily. To capture more semantic information efficiently, we added another U-Net at the bottom. We also adopt Atrous Spatial Pyramid Pooling (ASPP) to capture contextual information within the network. We have evaluated DoubleU-Net using four medical segmentation datasets, covering various imaging modalities such as colonoscopy, dermoscopy, and microscopy. Experiments on the MICCAI 2015 segmentation challenge, the CVC-ClinicDB, the 2018 Data Science Bowl challenge, and the Lesion boundary segmentation datasets demonstrate that DoubleU-Net outperforms U-Net and the baseline models. Moreover, DoubleU-Net produces more accurate segmentation masks, especially in the case of the CVC-ClinicDB and MICCAI 2015 segmentation challenge datasets, which have challenging images such as smaller and flat polyps. These results show the improvement over the existing U-Net model. The encouraging results, produced on various medical image segmentation datasets, show that DoubleU-Net can be used as a strong baseline for both medical image segmentation and cross-dataset evaluation testing to measure the generalizability of Deep Learning (DL) models.

    EMIT-Diff: Enhancing Medical Image Segmentation via Text-Guided Diffusion Model

    Large-scale, highly varied, and high-quality data are crucial for developing robust and successful deep-learning models for medical applications since they potentially enable better generalization performance and avoid overfitting. However, the scarcity of high-quality labeled data always presents significant challenges. This paper proposes a novel approach to address this challenge by developing controllable diffusion models for medical image synthesis, called EMIT-Diff. We leverage recent diffusion probabilistic models to generate realistic and diverse synthetic medical image data that preserve the essential characteristics of the original medical images by incorporating edge information of objects to guide the synthesis process. In our approach, we ensure that the synthesized samples adhere to medically relevant constraints and preserve the underlying structure of imaging data. Due to the random sampling process of the diffusion model, we can generate an arbitrary number of synthetic images with diverse appearances. To validate the effectiveness of our proposed method, we conduct an extensive set of medical image segmentation experiments on multiple datasets, including Ultrasound breast (+13.87%), CT spleen (+0.38%), and MRI prostate (+7.78%), achieving significant improvements over the baseline segmentation methods. To the best of our knowledge, these promising results demonstrate for the first time the effectiveness of EMIT-Diff for medical image segmentation tasks and show the feasibility of introducing a text-guided diffusion model for general medical image segmentation tasks. With carefully designed ablation experiments, we investigate the influence of various data augmentation ratios, hyper-parameter settings, patch size for generating random merging mask settings, and combined influence with different network architectures. Comment: 15 page